Segmentation of genomic DNA through entropic divergence: power laws and scaling.

Authors

  • Rajeev K Azad
  • Pedro Bernaola-Galván
  • Ramakrishna Ramaswamy
  • J Subba Rao
Abstract

Genomic DNA is fragmented into segments using the Jensen-Shannon divergence. Use of this criterion results in the fragments being entropically homogeneous to within a predefined level of statistical significance. Application of this procedure is made to complete genomes of organisms from archaebacteria, eubacteria, and eukaryotes. The distribution of fragment lengths in bacterial and primitive eukaryotic DNAs shows two distinct regimes of power-law scaling. The characteristic length separating these two regimes appears to be an intrinsic property of the sequence rather than a finite-size artifact, and is independent of the significance level used in segmenting a given genome. Fragment length distributions obtained in the segmentation of the genomes of more highly evolved eukaryotes do not have such distinct regimes of power-law behavior.
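The recursive procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`entropy`, `best_split`, `segment`), the `min_len` guard, and the fixed `threshold` on the statistic 2·N·ln 2·D are assumptions made for illustration. In particular, the plain threshold stands in for the calibrated significance test that the paper refers to, which is not specified in this abstract.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in bits) of a composition given symbol counts."""
    return -sum(c / total * math.log2(c / total) for c in counts.values() if c > 0)


def best_split(seq, min_len):
    """Return (position, divergence) of the cut maximizing the
    Jensen-Shannon divergence between the left and right parts."""
    n = len(seq)
    total = Counter(seq)
    h_all = entropy(total, n)
    left = Counter()
    best_pos, best_d = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        if i < min_len or n - i < min_len:
            continue  # hypothetical guard against very short fragments
        right = total - left  # Counter subtraction drops zero counts
        d = (h_all
             - (i / n) * entropy(left, i)
             - ((n - i) / n) * entropy(right, n - i))
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


def segment(seq, threshold=20.0, min_len=10, offset=0):
    """Recursively split seq and return the sorted cut positions.

    ASSUMPTION: a cut is accepted when 2*N*ln(2)*D exceeds `threshold`.
    This is a simple placeholder, not the paper's significance criterion.
    """
    pos, d = best_split(seq, min_len)
    if pos is None or 2.0 * len(seq) * math.log(2) * d < threshold:
        return []
    return (segment(seq[:pos], threshold, min_len, offset)
            + [offset + pos]
            + segment(seq[pos:], threshold, min_len, offset + pos))


def fragment_lengths(cuts, n):
    """Convert cut positions into the list of fragment lengths."""
    edges = [0] + cuts + [n]
    return [b - a for a, b in zip(edges, edges[1:])]
```

For example, `segment("A" * 200 + "G" * 200)` returns `[200]`, and `fragment_lengths([200], 400)` gives `[200, 200]`. The distribution of such lengths over a whole genome is what the abstract examines for power-law scaling. The recursion is written for clarity and would need an iterative form for genome-scale input.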

Related articles

Compositional segmentation and long-range fractal correlations in DNA sequences.

A segmentation algorithm based on the Jensen-Shannon entropic divergence is used to decompose long-range correlated DNA sequences into statistically significant, compositionally homogeneous patches. By adequately setting the significance level for segmenting the sequence, the underlying power-law distribution of patch lengths can be revealed. Some of the identified DNA domains were uncorrelated,...

Segmentation of DNA into Coding and Noncoding Regions Based on Recursive Entropic Segmentation and Stop-Codon Statistics

Heterogeneous DNA sequences can be partitioned into homogeneous domains that are comprised of the four nucleotides A, C, G, and T and the stop codons. Recursively, we apply a new entropic segmentation method on DNA sequences using Jensen-Shannon and Jensen-Rényi divergences in order to find the borders between coding and noncoding DNA regions. We have chosen 12- and 18-symbol alphabets that captu...

Scaling theory of DNA confined in nanochannels and nanoslits.

A scaling analysis is presented of the statistics of long DNA confined in nanochannels and nanoslits. It is argued that there are several regimes in between the de Gennes and Odijk limits introduced long ago. The DNA chain folds back on itself giving rise to a global persistence length that may be very large owing to entropic deflection. Moreover, there is an orientational excluded-volume effec...

Image Registration and Segmentation by Maximizing the Jensen-Rényi Divergence

Information theoretic measures provide quantitative entropic divergences between two probability distributions or data sets. In this paper, we analyze the theoretical properties of the Jensen-Rényi divergence which is defined between any arbitrary number of probability distributions. Using the theory of majorization, we derive its maximum value, and also some performance upper bounds in terms o...

Analysis of DNA behavior in passing through microstructures based on the Fokker-Planck equation and the entropic barrier model

We considered the motion of DNA molecules through a hexagonal array under uniform electric fields as a Fokker-Planck process which is affected by the entropic barriers and we have simulated this motion by computer. We solved the Fokker-Planck equation with numerical simulation of the Brownian dynamics by the Euler method. For different DNA molecules, under different physical conditions, the mea...

Journal:
  • Physical review. E, Statistical, nonlinear, and soft matter physics

Volume 65, Issue 5 Pt 1

Publication date: 2002